Generating Frequent Closed Item Sets Based on Zero-suppressed BDDs
نویسنده
چکیده
(Abstract) Frequent item set mining is one of the fundamental techniques for knowledge discovery and data mining. In the last decade, a number of efficient algorithms for frequent item set mining have been presented, but most of them focused on just enumerating the item set patterns which satisfy the given conditions, and it was a different matter how to store and index the result of patterns for efficient data analysis. Recently, we proposed a fast algorithm of extracting all frequent item set patterns from transaction databases and simultaneously indexing the result of huge patterns using Zero-suppressed BDDs (ZBDDs). That method, ZBDD-growth, is not only enumerating/listing the patterns efficiently, but also indexing the output data compactly on the memory to be analyzed with various algebraic operations. In this paper, we present a variation of ZBDD-growth algorithm to generate frequent closed item sets. This is a quite simple modification of ZBDD-growth, and additional computation cost is relatively small compared with the original algorithm for generating all patterns. Our method can conveniently be utilized in the environment of ZBDD-based pattern indexing.
منابع مشابه
N-Gram Analysis Based on Zero-Suppressed BDDs
In present paper, we propose a new method of n-gram analysis using ZBDDs (Zero-suppressed BDDs). ZBDDs are known as a compact representation of combinatorial item sets. Here, we newly apply the ZBDD-based techniques for efficiently handling sets of sequences. Using the algebraic operations defined over ZBDDs, such as union, intersection, difference, etc., we can execute various processings and/...
متن کاملUnordered N-gram Representation Based on Zero-suppressed BDDs for Text Mining and Classification
In this paper, we present a new method to analyze unordered n-grams by using ZBDDs (Zero-suppressed BDDs). n-grams have been used not only for text analysis but also for text indexing in some search engines. We newly use a variation of n-grams called unordered n-grams. Unordered n-grams abstract from the position of the characters in each n-gram, i.e., they just deal with the range of ordinary ...
متن کاملVSOP (Valued-Sum-Of-Products) Calculator Based on Zero-Suppressed BDDs
(Abstract) Recently, Binary Decision Diagrams (BDDs) are widely used for efficiently manipulating large-scale Boolean function data. BDDs are also applied for handling combinatorial item set data. Zero-suppressed BDDs (ZBDDs) are special type of BDDs which are suitable for implicitly handling large-scale combinatorial item set data. In this paper, we present VSOP program developed for calculati...
متن کاملSymmetric Item Set Mining Using Zero-suppressed BDDs
(Abstract) In this paper, we propose a method for discovering hidden information from large-scale item set data based on the symmetry of items. Symmetry is a fundamental concept in the theory of Boolean functions, and there have been developed fast symmetry checking methods based on BDDs (Binary Decision Diagrams). Here we discuss the property of symmetric items in data mining problems, and des...
متن کاملSymmetric Item Set Mining Method Using ZBDDs and Application to Biological Data
(Abstract) In this paper, we present a method of finding symmetric items in a combinatorial item set database. The techniques for finding symmetric variables in Boolean functions have been studied for long time in the area of VLSI logic design, and the BDD (Binary Decision Diagram)-based methods are presented to solve such a problem. Recently , we have developed an efficient method for handling...
متن کامل